Multidimensional scaling of noisy high dimensional data


Abstract

Multidimensional Scaling (MDS) is a classical technique for embedding data in low dimensions, still in widespread use today. In this paper we study MDS in a modern setting, specifically high dimensions and ambient measurement noise. We show that as the noise level increases, MDS suffers a sharp breakdown that depends on the data dimension and the noise level, and we derive an explicit formula for this breakdown point in the case of white noise. We then introduce MDS+, a simple variant of MDS, which applies a shrinkage nonlinearity to the eigenvalues of the MDS similarity matrix. Under a natural loss function measuring the embedding quality, we prove that MDS+ is the unique, asymptotically optimal shrinkage function. MDS+ offers improved embedding, sometimes significantly so, compared with MDS. Importantly, MDS+ calculates the optimal embedding dimension, into which the data should be embedded.
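As a rough illustration of where an eigenvalue shrinkage step could sit in the classical MDS pipeline, the following NumPy sketch may help. It is not the authors' implementation: the function names `classical_mds` and `hard_threshold` are hypothetical, and the hard-threshold rule is only a placeholder, since the optimal MDS+ shrinkage function is not given in this abstract.

```python
import numpy as np


def classical_mds(D, shrink=None, dim=None):
    """Classical MDS from an n x n matrix D of pairwise distances.

    shrink: optional function applied to the (descending, non-negative)
            eigenvalues of the similarity matrix. Hypothetical hook; the
            optimal MDS+ shrinker from the paper is NOT reproduced here.
    dim:    embedding dimension. If None, every eigenvalue that is still
            positive after shrinkage is kept.
    """
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    S = -0.5 * H @ (D ** 2) @ H                # MDS similarity (Gram) matrix
    vals, vecs = np.linalg.eigh(S)             # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]             # reorder to descending
    vals, vecs = vals[order], vecs[:, order]
    vals = np.clip(vals, 0.0, None)            # drop negative eigenvalues
    if shrink is not None:
        vals = shrink(vals)
    if dim is None:
        dim = int(np.count_nonzero(vals > 0))
    # Coordinates: leading eigenvectors scaled by sqrt of their eigenvalues.
    return vecs[:, :dim] * np.sqrt(vals[:dim])


def hard_threshold(tau):
    """Placeholder shrinker (an assumption): zero out eigenvalues <= tau."""
    return lambda vals: np.where(vals > tau, vals, 0.0)
```

Calling `classical_mds(D, dim=2)` gives the plain MDS embedding, while `classical_mds(D, shrink=hard_threshold(tau))` lets the shrinker decide how many dimensions survive, which mirrors the abstract's point that the shrinkage step also determines the embedding dimension. The threshold `tau` is a free parameter in this sketch.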


Similar resources

High Performance Multidimensional Scaling for Large High-Dimensional Data Visualization

Technical advancements produce a huge amount of scientific data, usually in high-dimensional formats, and it is becoming more important to analyze such large-scale high-dimensional data. Dimension reduction is a well-known approach for high-dimensional data visualization, but it can be very time- and memory-demanding for large problems. Among many dimension reduction methods, multidimensi...


Data Visualization With Multidimensional Scaling

We discuss methodology for multidimensional scaling (MDS) and its implementation in two software systems, GGvis and XGvis. MDS is a visualization technique for proximity data, that is, data in the form of N × N dissimilarity matrices. MDS constructs maps (“configurations,” “embeddings”) in R^k by interpreting the dissimilarities as distances. Two frequent sources of dissimilarities are high-dim...


Multidimensional Scaling and Data Clustering

Visualizing and structuring pairwise dissimilarity data are difficult combinatorial optimization problems known as multidimensional scaling or pairwise data clustering. Algorithms for embedding a dissimilarity data set in a Euclidean space, for clustering these data, and for actively selecting data to support the clustering process are discussed in the maximum entropy framework. Active data select...


The noisy multidimensional scaling problem: an optimization approach

Multidimensional scaling is a fundamental problem in data analysis and has many applications. Its goal is to find a Euclidean graphic representation of a given set of data in a “low” dimensional space (generally in R^2 or R^3). This problem can be formulated as a nonlinear global optimization problem. To solve it, a Levenberg-Marquardt method is applied to different cost functions. Res...



Journal

Journal title: Applied and Computational Harmonic Analysis

Year: 2021

ISSN: 1096-603X, 1063-5203

DOI: https://doi.org/10.1016/j.acha.2020.11.006